Constraint-Based Mining of Episode Rules and Optimal Window Sizes

Authors

  • Nicolas Méger
  • Christophe Rigotti
Abstract

Episode rules are patterns that can be extracted from a large event sequence to suggest to experts possible dependencies among occurrences of event types. The corresponding mining approaches have been designed to find rules under a temporal constraint that specifies the maximum elapsed time between the first and the last event of the occurrences of the patterns (i.e., a window size constraint). In some applications the appropriate window size is not known, and furthermore, this size is not the same for different rules. To cope with this class of applications, it has recently been proposed in [2] to specify the maximal elapsed time between two events (i.e., a maximum gap constraint) instead of a window size constraint. Unfortunately, we show that the algorithm proposed to handle the maximum gap constraint is not complete. In this paper we present a sound and complete algorithm to mine episode rules under the maximum gap constraint, and propose to find, for each rule, the window size corresponding to a local maximum of confidence. We show that the extraction can be efficiently performed in practice on real and synthetic datasets. Finally, the experiments show that the notion of local maximum of confidence is significant in practice, since no local maxima are found in random datasets, while they can be found in real ones.
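
To make the idea of a per-rule window size concrete, the following is a minimal Python sketch, not the algorithm from the paper. It assumes a brute-force notion of support (the number of start events whose narrowest gap-respecting occurrence fits in a window of width w) and a simplified local-maximum test (confidence at or above a threshold, strictly higher than at both neighbouring window sizes). The function names, the toy sequence, and the parameters `max_gap`, `max_window`, and `min_conf` are all hypothetical choices made for illustration.

```python
def min_widths(events, episode, max_gap):
    """For each event matching episode[0], return the smallest width
    (last time - first time) of an occurrence of the serial episode whose
    consecutive events are at most max_gap apart. Starts that cannot be
    completed are skipped. Brute force, for illustration only."""
    times = {}
    for t, typ in events:
        times.setdefault(typ, []).append(t)
    widths = []
    for t0 in times.get(episode[0], []):
        frontier = {t0}  # all times at which the current prefix can end
        for typ in episode[1:]:
            frontier = {t for t in times.get(typ, [])
                        if any(s < t <= s + max_gap for s in frontier)}
            if not frontier:
                break
        if frontier:
            widths.append(min(frontier) - t0)
    return widths


def support(widths, w):
    """Number of occurrences that fit in a window of width w (assumed definition)."""
    return sum(1 for x in widths if x <= w)


def confidence_by_window(events, lhs, rhs, max_gap, max_window):
    """Confidence of the rule lhs => rhs for each window size 1..max_window."""
    lhs_widths = min_widths(events, lhs, max_gap)
    rule_widths = min_widths(events, lhs + rhs, max_gap)
    conf = {}
    for w in range(1, max_window + 1):
        denom = support(lhs_widths, w)
        if denom:
            conf[w] = support(rule_widths, w) / denom
    return conf


def first_local_maximum(conf, min_conf):
    """Smallest window size whose confidence is >= min_conf and strictly
    higher than at both neighbouring window sizes (simplified criterion)."""
    ws = sorted(conf)
    for prev, cur, nxt in zip(ws, ws[1:], ws[2:]):
        if conf[cur] >= min_conf and conf[prev] < conf[cur] > conf[nxt]:
            return cur
    return None


# Toy event sequence of (time, event type) pairs, invented for this sketch.
events = [(1, "A"), (2, "B"), (3, "C"),
          (10, "A"), (11, "B"), (12, "C"),
          (20, "A"), (23, "B"),
          (30, "A"), (33, "B")]

conf = confidence_by_window(events, ("A", "B"), ("C",), max_gap=3, max_window=4)
best = first_local_maximum(conf, min_conf=0.8)
print(conf, best)
```

In this toy sequence the rule A, B => C holds whenever B closely follows A, so confidence peaks at a small window; widening the window admits looser A, B occurrences that are never followed by C, and confidence drops. That rise and fall is the kind of behaviour a per-rule window size is meant to capture.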

Similar resources

A new constraint for mining sets in sequences

Discovering interesting episodes is a popular area in temporal or sequential data mining, examples of which are mining text or protein sequences. In such data, the order in which the events appear is being analysed and the user’s goal is to identify the regularities that may appear in the dataset, consisting of one or more sequences. The usual approach to episode discovery is to look for episod...

Optimizing Membership Functions using Learning Automata for Fuzzy Association Rule Mining

Transactions in web data often consist of quantitative data, suggesting that fuzzy set theory can be used to represent such data. The time spent by users on each web page, one type of web data, was regarded as a trapezoidal membership function (TMF) and can be used to evaluate user browsing behavior. The quality of mining fuzzy association rules depends on membership functions and since t...

Evaluation of effective factors in window optimization of fry analysis to identify mineralization pattern: Case study of Bavanat region, Iran

The known ore deposits and mineralization trends are important key exploration criteria in mineral exploration within a specific region. Fry analysis has conventionally been considered as a suitable method to determine the mineralization trends related to linear structures. Based upon literature sources, to date, no investigation has been carried out that includes the Sensitivity Analysis of Fe...

Discovery of Frequent Episodes in Event Logs

The lion’s share of process mining research focuses on the discovery of end-to-end process models describing the characteristic behavior of observed cases. The notion of a process instance (i.e., the case) plays an important role in process mining. Pattern mining techniques (such as frequent itemset mining, association rule learning, sequence mining, and traditional episode mining) do not consider ...

Introducing an algorithm for use to hide sensitive association rules through perturb technique

Due to the rapid growth of data mining technology, obtaining private data on users through this technology becomes easier. Association Rules Mining is one of the data mining techniques to extract useful patterns in the form of association rules. One of the main problems in applying this technique on databases is the disclosure of sensitive data by endangering security and privacy. Hiding the as...

Publication date: 2004